ABSTRACT

The paper describes the considerations that were taken into account in the development of a 
tentative English language textbook evaluation checklist. A brief review of the related literature 
precedes a discussion of the crucial issues that should be considered in developing checklists. In the light of 
previous evaluation checklists, the developers created a list of the evaluative criteria on which the 
construct of the checklist could be established. The developers considered matters of validity, 
reliability and practicality in the process of its design; however, further research is in progress to 
refine the checklist. Such an instrument could be used by curriculum designers, material 
developers and evaluators, as well as English language teachers. 

INTRODUCTION

According to Sheldon (1988), we need to evaluate textbooks for two reasons. First, the evaluation will 
help the teacher or program developer in making decisions on selecting the appropriate textbook. 
Furthermore, evaluation of the merits and demerits of a textbook will familiarize the teacher with its 
probable weaknesses and strengths. This will enable teachers to make appropriate adaptations to the material in their 
future instruction. In this line, Cunningsworth (1995) and Ellis (1997) propose that textbook evaluation can be of 
three types, namely ‘pre-use’, ‘in-use’, and ‘post-use’ evaluations. Evaluation of textbooks for pre-use, or predictive, 
purposes helps teachers in selecting the most appropriate textbook for a given language classroom by considering its 
prospective performance. The second type of evaluation helps the teacher explore the weaknesses or strengths of 
the textbook while it is being used. Finally, post-use, or retrospective, evaluation helps the teacher reflect on the 
quality of the textbook after it has been used in a particular learning-teaching situation. 

A checklist is an instrument that helps practitioners in English Language Teaching (ELT) evaluate 
language teaching materials, like textbooks. It allows a more sophisticated evaluation of the textbook in reference to 
a set of generalizable evaluative criteria. These checklists may be quantitative or qualitative. Quantitative scales 
have the merit of allowing an objective evaluation of a given textbook through Likert style rating scales (e.g., 
Skierso, 1991). Qualitative checklists, on the other hand, often use open-ended questions to elicit subjective 
information on the quality of course books (e.g., Richards, 2001). While qualitative checklists are capable of an 
in-depth evaluation of textbooks, quantitative checklists are more reliable instruments and are more convenient to work 
with, especially when team evaluations are involved. 

The review of textbook evaluation checklists within four decades (1970-2000) by Mukundan and Ahour 
(2010) revealed that more of the checklists are qualitative (e.g., Rahimy, 2007; Driss, 2006; McDonough & Shaw, 
2003; Rubdy, 2003; Garinger, 2002; Krug, 2002; McGrath, 2002; Garinger, 2001; Richards, 2001; Zabawa, 2001; 
Hemsley, 1997; Cunningsworth, 1995; Griffiths, 1995; Cunningsworth & Kusel, 1991; Harmer, 1991; Sheldon, 
1988; Breen & Candlin, 1987; Dougill, 1987; Hutchinson & Waters, 1987; Matthews, 1985; Cunningsworth, 1984; 
Bruder, 1978; Haycraft, 1978; Robinett, 1978) than quantitative (e.g., Canado & Esteban, 2005; Litz, 2005; 
Miekley, 2005; Harmer, 1998; Peacock, 1997; Ur, 1996; Skierso, 1991; Sheldon, 1988; Grant, 1987; Williams, 
1983; Daoud & Celce-Murcia, 1979; Tucker, 1978) or in head words/outline format, i.e., those without rating scales or 
questions (Ansari & Babaii, 2002; Littlejohn, 1998; Roberts, 1996; Brown, 1995). Most of these checklists are either 
too short or too long, and some of their criteria are vague, so they do not fully meet the requirements of a good 
and applicable instrument for evaluation purposes. 

Despite their crucial roles in language instruction, most, if not all, of the available textbook evaluation 
checklists have been developed qualitatively, often with no empirical evidence in support of their construct validity. 
Additionally, even when fundamental matters like validity and reliability are accounted for, most of these checklists 
are impractical. For example, some make use of ELT terminology that sounds ambiguous to language instructors 
with little expertise in the area. A further disadvantage of some of the available checklists is that, because of their 
high number of items, they lack economy and hence practicality. This could be the reason why most language 
learning materials in the world are evaluated based on the subjective and impressionistic judgment of evaluators. 

DETERMINING THE EVALUATIVE CRITERIA 

English language teaching (ELT) material developers and evaluators need to take a wide range of factors 
into consideration before they make decisions on the materials they develop or select for particular contexts. Some 
of these factors include the roles of the learner, teacher, and instructional materials as well as the syllabus (Richards 
& Rodgers, 1987). In order to account for these roles effectively, the evaluator must gain an awareness of the learner 
and teacher’s needs and interests (Bell & Gower, 1998). 

As has been argued by some scholars (e.g., Byrd, 2001; Sheldon, 1988), evaluative criteria of checklists 
should be chosen according to the learning-teaching context and the specific needs of the learner and teacher. 
However, a review of the available checklists indicates that they have many identical evaluative criteria regardless 
of the fact that they had been developed in different parts of the world for different learning-teaching situations and 
purposes. Most well-established checklists such as Cunningsworth and Kusel (1991) or Skierso (1991) examine 
similar dimensions like physical attributes of textbooks including aims, layout, methodology, and organization. 
Some other criteria that are present in most checklists include the way language skills (speaking, listening, etc.), 
sub-skills (grammar, vocabulary, etc.), and functions are presented in the textbook depending on the present 
socio-cultural setting (Zabawa, 2001; Ur, 1996; Cunningsworth, 1995; Harmer, 1991). 

In addition to the criteria mentioned above, a checklist must take into account the background of the target 
students who are going to use the textbook. This background can encompass a variety of dimensions including students’ age, 
needs and interests (Byrd, 2001; Skierso, 1991). Finally, the language used in the various texts of the textbook under 
evaluation should present natural and authentic examples of language use in the real world. According to Bell and 
Gower (1998), employing real language in the textbook contributes to the students’ motivation by helping the 
teacher “get them off the learning plateau” (p. 123). Based on the review of the literature on the textbook evaluation 
checklists, the researchers created a tentative classification of textbook evaluation criteria (Figure 1). 

As the figure shows, we divided the list of criteria into two general categories: ‘general attributes’ and 
‘learning-teaching content’. The first category was further divided into five sub-categories of ‘relation to syllabus 
and curriculum’, ‘methodology’, ‘suitability to learners’, ‘physical and utilitarian attributes’, and ‘supplementary 
materials’. The criteria in the second category, on the other hand, included ‘general’ (i.e., task quality, cultural 
sensitivity, as well as linguistic and situational realism), ‘listening’, ‘speaking’, ‘reading’, ‘writing’, ‘vocabulary’, 
‘grammar’, ‘pronunciation’, and ‘exercises’. 

Figure 1. Classification of textbook evaluation criteria 


A TENTATIVE CHECKLIST FOR TEXTBOOK EVALUATION 

A tentative checklist was developed based on the classification indicated in Figure 1. Appendix A 
presents the checklist, which consists of two main sections, namely ‘general attributes’ and ‘learning-teaching 
content’. These sections were further divided into several sub-categories following the aforementioned 
classification. In order to avoid misinterpretations of these features, the developers added one or more descriptors to 
each sub-category. To offer an example, three descriptors (items I.C.4-6) were added under ‘suitability to learners’. 
These three items indicate that a suitable textbook should account for learners’ age, needs and interests. 

In the development of this checklist several points had to be considered. First was the issue of validity. To 
ensure the validity or relevance of any instrument, its developers must be aware of the relevant theories (Messick, 
1994). In the light of their present teaching-learning situation, they should consider the construct domain being 
addressed and specify the criteria for evaluation of ELT materials, which, in turn, will show aspects of the 
evaluation procedure that may influence the final results and should, therefore, be closely considered. In order to be 
relevant to the context, a checklist should consider the purpose of evaluation, students, and other features discussed 
in the preceding section. The present checklist was developed based on a review of similar previous instruments 
to ensure its construct validity. Meanwhile, certain items (e.g., items I.A.1, I.C.4-6, II.A.4) in the checklist 
draw evaluators’ attention to their present context. 

Tomlinson (2003) suggests avoiding large, vague, and dogmatic questions that might be interpreted 
differently by different evaluators. Eliminating such questions in the trial process of a developed checklist may 
result in a more systematic, rigorous, and reliable evaluation. The clarity of the items should, therefore, be taken into 
consideration. A vague item can decrease the reliability of the instrument. There are certain checklists that fail to 
elaborate on some items, which makes their comprehension very challenging for the novice evaluator. For instance, 
one of the items in Byrd (2001) describes the ‘Fit between textbook and the curriculum’ as, “fits the pedagogical and 
SLA philosophy of the program/course” (p. 427). Such an item may be easily discernible for an expert in the area; 
however, it will not be clear enough for an end-user with little expertise. Developers should seek to design clear 
items if they really wish their checklist to be utilized. As an example, Skierso (1991) clearly describes the criterion 
of ‘vocabulary load’ as “the number of new words introduced every lesson” (p. 446). This contributes to the clarity 
and, in turn, to the reliability of the instrument. 

In addition to validity and reliability, checklist developers should also take matters of practicality into 
account. A checklist must be economical, for instance. Cunningsworth (1995) suggests, “It is important to limit the 
number of criteria used, the number of questions asked, to manageable proportions, otherwise we risk being 
swamped in a sea of details” (p. 5). If a checklist is precise and short, it will be expeditious and save considerable 
time and budget once it is utilized for evaluation purposes. According to Mukundan and Ahour (2010), the length of 
the quantitative textbook evaluation checklists in their study ranged between 113 (Tucker, 1978) and 4553 (Skierso, 
1991) running words. The developers of the present checklist made an attempt to come up with a relatively concise 
instrument. The number of running words in this checklist is 356, which turns out to be moderate compared to other 
similar checklists. This contributes to its economy. 
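
A running-word count of this kind can be approximated with a few lines of code. The sketch below is illustrative only: the counting rule (one word per whitespace-separated token containing a letter or digit) and the function name `count_running_words` are assumptions, since the rule applied by Mukundan and Ahour (2010) is not stated here, and the three sample items are taken from Appendix A.

```python
import re


def count_running_words(text: str) -> int:
    """Count running words under an assumed rule: every whitespace-separated
    token that contains at least one letter or digit counts as one word."""
    return sum(1 for token in text.split() if re.search(r"\w", token))


# Three items copied from Appendix A, used only as sample input.
sample_items = [
    "It matches to the specifications of the syllabus.",
    "Its layout is attractive.",
    "It is durable.",
]

total = sum(count_running_words(item) for item in sample_items)
print(total)
```

Under this rule a hyphenated form such as "cost-effective" counts as a single word; a different tokenization would give a different total, so counts are only comparable across checklists when the same rule is applied to all of them.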

Another subject that should be noted in the operationalization of a checklist is the numerical value of its 
items. According to the literature, when faced with an odd-numbered scale, evaluators usually go for a middle score 
(McColloy & Remsted, 1965). This is known as the problem of “central tendency”; that is, “the inclination to rate 
people in the middle of the scale even when their performance clearly warrants a substantially higher or lower 
rating” (Grote, 1996, p. 138). To offer an example, in a five-point scale, an evaluator will more probably assign 3. 
Therefore, it is advisable to avoid odd-numbered scales when developing an instrument (Sager, 1972). However, as 
is the case in most of the available checklists, a rating scale of 0-4 (where 4 = Excellent, 3 = Good, 2 = Adequate, 1 = 
Weak, and 0 = Totally lacking) is the dominant form employed (e.g., Daoud & Celce-Murcia, 1979; Skierso, 1991). 
Furthermore, the end-users of checklists can be informed of the problem of central tendency and be advised to avoid 
it. Having considered these variables, a scale developer can hope to come up with a fair and viable checklist. 
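
One simple way to look for central tendency in a completed checklist is to compute the share of ratings that fall on the scale midpoint. The following is a minimal sketch under stated assumptions: the ratings are hypothetical, the function name `midpoint_share` is invented for illustration, and the 0.5 cut-off is an arbitrary example rather than a threshold taken from the literature.

```python
from collections import Counter

SCALE = (0, 1, 2, 3, 4)  # 0 = Totally lacking ... 4 = Excellent


def midpoint_share(ratings):
    """Return the fraction of ratings that fall on the scale midpoint (2)."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    invalid = [r for r in ratings if r not in SCALE]
    if invalid:
        raise ValueError(f"ratings outside the 0-4 scale: {invalid}")
    midpoint = SCALE[len(SCALE) // 2]
    return Counter(ratings)[midpoint] / len(ratings)


# Hypothetical ratings given by one evaluator to ten checklist items.
ratings = [2, 2, 3, 2, 2, 1, 2, 2, 4, 2]
share = midpoint_share(ratings)

# Assumed, arbitrary cut-off: flag the evaluator for a second look when
# more than half of the ratings sit on the midpoint.
flagged = share > 0.5
print(share, flagged)
```

A high midpoint share does not prove central tendency, since a textbook may genuinely be adequate on most criteria; it only indicates which evaluations may deserve a closer look.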

CONCLUSION 

This paper discussed the development of a tentative textbook evaluation checklist. For this purpose, the 
related literature was reviewed and a list of important evaluative criteria was created. This list was further developed 
by adding one or more items under each category to describe it in detail. A five-point scale was added to the 
checklist to help the evaluators assess the textbook in reference to each item. 
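
The paper does not prescribe how item ratings should be combined, so the following is only a sketch of one possible tally, assuming the 0-4 scale and expressing each section's total as a percentage of its maximum possible score. The function name `section_percentages` and the ratings are hypothetical.

```python
MAX_RATING = 4  # top of the assumed 0-4 rating scale


def section_percentages(ratings_by_section):
    """Express each section's summed ratings as a percentage of its maximum."""
    result = {}
    for section, ratings in ratings_by_section.items():
        if not ratings:
            raise ValueError(f"no ratings given for section: {section}")
        if any(r not in range(MAX_RATING + 1) for r in ratings):
            raise ValueError(f"rating outside the 0-4 scale in: {section}")
        result[section] = 100 * sum(ratings) / (MAX_RATING * len(ratings))
    return result


# Hypothetical ratings for a handful of items in each main section.
ratings = {
    "General attributes": [3, 4, 2, 3],
    "Learning-teaching content": [2, 2, 3, 1],
}
scores = section_percentages(ratings)
print(scores)
```

Reporting percentages rather than raw sums keeps sections with different numbers of items comparable; whether the sections should be weighted equally is a separate design decision that this sketch leaves open.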

A checklist of this type could be useful for pre-use, in-use and post-use textbook evaluation purposes (Ellis, 
1997; Cunningsworth, 1995). Based on the results of such forms of evaluation, substantial educational and 
administrative decisions could be made that may have financial, professional, and/or political implications (Sheldon, 
1988). The checklist could prove informative and useful for curriculum designers, ELT material developers or 
teachers in the classroom, providing them with useful ideas according to which the materials being evaluated can be 
improved. 

The present checklist as it appears in this paper can be further refined through qualitative and/or 
quantitative studies. Focus group interviews can help the developers improve the clarity of the instrument. 
Furthermore, a survey of ELT material experts’ evaluation of the checklist and a factor analysis of the collected data 
can provide empirical evidence for its inclusiveness. Finally, further research is also needed to test the correlation 
between the findings of the current checklist and those of other well-established instruments. 
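
Such a correlation could be estimated along the following lines. This is a sketch only: it assumes that each textbook receives one total score from the present checklist and one from a comparison instrument, it uses Pearson's coefficient (other coefficients may suit ordinal data better), and the scores shown are invented for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical total scores for five textbooks on two instruments.
present_checklist = [70, 85, 60, 90, 75]
other_instrument = [65, 80, 62, 88, 70]

r = pearson_r(present_checklist, other_instrument)
print(round(r, 2))
```

A coefficient close to 1 would suggest that the two instruments rank textbooks similarly, although a sample of this size would be far too small for a real validation study.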

Appendix A: A Tentative Checklist For Textbook Evaluation 


Rating scale for each item: 0  1  2  3  4

I. General attributes 

A. The book in relation to syllabus and curriculum 
1. It matches to the specifications of the syllabus. 

B. Methodology 
2. The activities can be exploited fully and can embrace the various methodologies in ELT. 
3. Activities can work well with methodologies in ELT. 

C. Suitability to learners 
4. It is compatible to the age of the learners. 
5. It is compatible to the needs of the learners. 
6. It is compatible to the interests of the learners. 

D. Physical and utilitarian attributes 
7. Its layout is attractive. 
8. It indicates efficient use of text and visuals. 
9. It is durable. 
10. It is cost-effective. 

E. Efficient outlay of supplementary materials 
11. The book is supported efficiently by essentials like audio-materials. 

II. Learning-teaching content 

A. General 
1. Most of the tasks in the book are interesting. 
2. Tasks move from simple to complex. 
3. Task objectives are achievable. 
4. Cultural sensitivities have been considered. 
5. The language in the textbook is natural and real. 
6. The situations created in the dialogues sound natural and real. 

B. Listening 
7. The book has appropriate listening tasks with well-defined goals. 
8. Tasks are efficiently graded according to complexity. 
9. Tasks are authentic or close to real language situations. 

C. Speaking 
10. Activities are developed to initiate meaningful communication. 
11. Activities are balanced between individual response, pair work and group work. 

D. Reading 
12. Texts are graded. 
13. Texts are interesting. 

E. Writing 
14. Tasks have achievable goals and take into consideration learner capabilities. 
15. Tasks are interesting. 

F. Vocabulary 
16. The load (number of new words in each lesson) is appropriate to the level. 
17. There is a good distribution (simple to complex) of vocabulary load across chapters and the whole book. 
18. Words are efficiently repeated and recycled across the book. 

G. Grammar 
19. The spread of grammar is achievable. 
20. The grammar is contextualized. 
21. Examples are interesting. 
22. Grammar is introduced explicitly and reworked incidentally throughout the book. 

H. Pronunciation 
23. It is contextualized. 
24. It is learner-friendly with no complex charts. 

I. Exercises 
25. They are learner friendly. 
26. They are adequate. 
27. They help students who are under/over-achievers. 